Operations
1. groupBy
Syntax: groupBy(fieldsArr, reducers)
parameters:
- fieldsArr:
required: true
type: Array of string
description: Array containing the names of the dimensions by which the data is grouped.
- reducers:
required: false
type: Array of Array
default: []
description: An array of pairs (two-element arrays) in which index 0 is the variable name and index 1 is the aggregation function.
Groups the data by the given dimensions and reduces the measures. It expects a list of dimensions, projects the DataModel onto them, and performs aggregations to collapse the duplicate tuples.
By default, DataModel provides aggregation functions to aggregate the grouped measure values.
Returns: DataModel (Returns a new DataModel instance after performing the groupBy)
const Datamodel = muze.DataModel;
const data = [
{
Maker: "chevrolet",
Name: "chevrolet chevelle malibu",
Miles_per_Gallon: 18,
Cylinders: 8,
Displacement: 307,
Horsepower: 130,
Weight_in_lbs: 3504,
Acceleration: 12,
Year: "1970-01-01",
Origin: "USA",
},
{
Maker: "buick",
Name: "buick skylark 320",
Miles_per_Gallon: 15,
Cylinders: 8,
Displacement: 350,
Horsepower: 165,
Weight_in_lbs: 3693,
Acceleration: 11.5,
Year: "1970-01-01",
Origin: "USA",
},
// ... and so on...
];
const schema = [
{
name: "Name",
type: "dimension",
},
{
name: "Maker",
type: "dimension",
},
{
name: "Miles_per_Gallon",
type: "measure",
defAggFn: "avg",
},
{
name: "Displacement",
type: "measure",
defAggFn: "sum",
},
{
name: "Horsepower",
type: "measure",
defAggFn: "sum",
},
{
name: "Weight_in_lbs",
type: "measure",
defAggFn: "min",
},
{
name: "Acceleration",
type: "measure",
defAggFn: "sum",
},
{
name: "Origin",
type: "dimension",
},
{
name: "Cylinders",
type: "dimension",
},
{
name: "Year",
type: "dimension",
subtype: "temporal",
format: "%Y-%m-%d",
},
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
const outputDM = dm.groupBy(
["Year"],
[["Horsepower", Datamodel.AggregationFunctions.MAX]], // reducers is an array of [variable, aggregation function] pairs
);
Printing outputDM gives:
Year | Miles_per_Gallon | Displacement | Horsepower | Weight_in_lbs | Acceleration |
---|---|---|---|---|---|
-19800000 | 18 | 455 | 147.55882352941177 | 1835 | 12.544117647058824 |
3151620000 | 21.25 | 400 | 104.92857142857143 | 1613 | 15.310344827586206 |
6305220000 | 18.714285714285715 | 429 | 120.17857142857143 | 2100 | 15.125 |
9467460000 | 17.1 | 455 | 130.475 | 1867 | 14.3125 |
12621060000 | 22.703703703703702 | 350 | 94.23076923076923 | 1649 | 16.203703703703702 |
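The grouping semantics described above can be sketched in plain JavaScript. This is an illustration only, not the DataModel implementation: groupRows, the reducers map and the 1971 row are hypothetical, and the sketch works on plain objects rather than DataModel instances.

```javascript
// Plain-JavaScript sketch of grouping; not the DataModel implementation.
// Hypothetical reducers standing in for the built-in aggregation functions.
const reducers = {
  sum: (values) => values.reduce((a, b) => a + b, 0),
  avg: (values) => values.reduce((a, b) => a + b, 0) / values.length,
  min: (values) => Math.min(...values),
  max: (values) => Math.max(...values),
};

// rows: array of plain objects
// dimensions: field names to group by
// measures: { fieldName: reducerName } for every measure to aggregate
function groupRows(rows, dimensions, measures) {
  const groups = new Map();
  for (const row of rows) {
    // Rows with identical dimension values share a key and land in one group.
    const key = JSON.stringify(dimensions.map((d) => row[d]));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  // Collapse every group into a single row by reducing each measure.
  return [...groups.values()].map((members) => {
    const out = {};
    for (const d of dimensions) out[d] = members[0][d];
    for (const [field, fn] of Object.entries(measures)) {
      out[field] = reducers[fn](members.map((m) => m[field]));
    }
    return out;
  });
}

const sampleRows = [
  { Year: "1970-01-01", Horsepower: 130, Weight_in_lbs: 3504 },
  { Year: "1970-01-01", Horsepower: 165, Weight_in_lbs: 3693 },
  { Year: "1971-01-01", Horsepower: 90, Weight_in_lbs: 2264 }, // hypothetical row
];

const grouped = groupRows(sampleRows, ["Year"], {
  Horsepower: "max",
  Weight_in_lbs: "min",
});
console.log(grouped);
// [ { Year: '1970-01-01', Horsepower: 165, Weight_in_lbs: 3504 },
//   { Year: '1971-01-01', Horsepower: 90, Weight_in_lbs: 2264 } ]
```

The two 1970 rows collapse into one tuple because they share the same Year value; each measure is then reduced with its own function.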
2. sort
Syntax: sort(sortingDetails)
parameters:
- sortingDetails:
required: true
type: Array of Array
description: Sorting details based on which the sorting will be performed.
Performs sorting according to the specified sorting details. Like every other operator, it does not mutate the DataModel instance on which it is called; instead, it returns a new DataModel instance containing the sorted data.
DataModel supports multi-level sorting: list the variables by which to sort, each with its sort order, ASC or DESC.
Returns: DataModel (Returns a new instance of DataModel with sorted data)
In the following example, the data is first sorted by the Origin field in DESC order, and then, within each Origin, by Acceleration in ASC order.
const Datamodel = muze.DataModel;
const data = [
//... Cars Data as shown in above example ...
];
const schema = [
//... Cars Schema as shown in above example ...
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
const outputDM = dm.sort([
["Origin", "desc"],
["Acceleration"], // Default value is ASC
]);
Printing outputDM gives:
Name | Maker | Miles_per_Gallon | Displacement | Horsepower | Weight_in_lbs | Acceleration | Origin | Cylinders | Year |
---|---|---|---|---|---|---|---|---|---|
plymouth 'cuda 340 | plymouth | 14 | 340 | 160 | 3609 | 8 | USA | 8 | -19800000 |
ford mustang boss 302 | ford | NaN | 302 | 140 | 3353 | 8 | USA | 8 | -19800000 |
plymouth fury iii | plymouth | 14 | 440 | 215 | 4312 | 8.5 | USA | 8 | -19800000 |
amc ambassador dpl | amc | 15 | 390 | 190 | 3850 | 8.5 | USA | 8 | -19800000 |
chevrolet impala | chevrolet | 14 | 454 | 220 | 4354 | 9 | USA | 8 | -19800000 |
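How the levels interact can be sketched with a plain JavaScript comparator. This is a sketch under assumptions, not the DataModel implementation: sortRows is a hypothetical helper that works on plain objects, and it takes sorting details in the same shape as the call above, with "asc" assumed when the order is omitted.

```javascript
// Plain-JavaScript sketch of multi-level sorting; not the DataModel implementation.
// sortingDetails: [[fieldName, "asc" | "desc"], ...], "asc" assumed when omitted.
function sortRows(rows, sortingDetails) {
  const compare = (a, b) => {
    for (const [field, order = "asc"] of sortingDetails) {
      if (a[field] === b[field]) continue; // tie: fall through to the next level
      const result = a[field] < b[field] ? -1 : 1;
      return order.toLowerCase() === "desc" ? -result : result;
    }
    return 0; // equal on every level
  };
  // Sort a copy so that the input array is left untouched.
  return [...rows].sort(compare);
}

const cars = [
  { Name: "datsun pl510", Origin: "Japan", Acceleration: 14.5 },
  { Name: "chevrolet chevelle malibu", Origin: "USA", Acceleration: 12 },
  { Name: "plymouth 'cuda 340", Origin: "USA", Acceleration: 8 },
  { Name: "citroen ds-21 pallas", Origin: "Europe", Acceleration: 17.5 },
];

const sortedCars = sortRows(cars, [["Origin", "desc"], ["Acceleration"]]);
console.log(sortedCars.map((row) => row.Name));
// [ "plymouth 'cuda 340", 'chevrolet chevelle malibu', 'datsun pl510', 'citroen ds-21 pallas' ]
```

The second level only decides the order of rows that tie on the first level, which is why the two USA rows are ordered by Acceleration while the other rows keep their Origin order.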
3. calculateVariable
Syntax: calculateVariable(params)
parameters:
- schema:
required: true
type: JSON
description: Schema of the newly defined variable
- fieldName(s):
required: true
type: Array
description: Array of names of existing variables from which the new variable is computed
- resolverFunction:
required: true
type: Function
description: A function that defines how each value of the new field is generated from the corresponding values of the fields listed in the previous parameter
Returns: DataModel (Instance of DataModel with the new field)
Creates a new variable calculated from existing variables. This method expects the definition of the new variable and a function that resolves its value from the existing variables.
Creates a new measure based on existing variables:
Example 1
const Datamodel = muze.DataModel;
const data = [
//... Cars Data as shown in above example ...
];
const schema = [
{
name: "Origin",
type: "dimension",
},
{
name: "Cylinders",
type: "dimension",
},
{
name: "Horsepower",
type: "measure",
defAggFn: "avg",
},
{
name: "Weight_in_lbs",
type: "measure",
defAggFn: "min",
},
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
const outputDM = dm.calculateVariable(
{
name: "powerToWeight",
type: "measure", // Schema of variable
},
["Horsepower", "Weight_in_lbs"],
(hp, weight) => hp / weight,
);
Original DataModel
Origin | Cylinders | Horsepower | Weight_in_lbs |
---|---|---|---|
USA | 8 | 130 | 3504 |
USA | 8 | 165 | 3693 |
USA | 8 | 150 | 3436 |
USA | 8 | 150 | 3433 |
USA | 8 | 140 | 3449 |
New DataModel
Origin | Cylinders | Horsepower | Weight_in_lbs | powerToWeight |
---|---|---|---|---|
USA | 8 | 130 | 3504 | 0.037100456621004564 |
USA | 8 | 165 | 3693 | 0.04467912266450041 |
USA | 8 | 150 | 3436 | 0.043655413271245634 |
USA | 8 | 150 | 3433 | 0.043693562481794346 |
USA | 8 | 140 | 3449 | 0.0405914757900840 |
Example 2
const Datamodel = muze.DataModel;
const data = [
//... Cars Data as shown in above example ...
];
const schema = [
//... Cars Schema as shown in above example ...
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
const outputDM = dm.calculateVariable(
{
name: "Efficiency",
type: "dimension",
},
["Horsepower"],
(hp) => {
if (hp < 80) {
return "low";
} else if (hp < 120) {
return "moderate";
} else {
return "high";
}
},
);
Printing outputDM gives:
Name | Maker | Miles_per_Gallon | Displacement | Horsepower | Weight_in_lbs | Acceleration | Origin | Cylinders | Year | Efficiency |
---|---|---|---|---|---|---|---|---|---|---|
chevrolet chevelle malibu | chevrolet | 18 | 307 | 130 | 3504 | 12 | USA | 8 | -19800000 | high |
buick skylark 320 | buick | 15 | 350 | 165 | 3693 | 11.5 | USA | 8 | -19800000 | high |
plymouth satellite | plymouth | 18 | 318 | 150 | 3436 | 11 | USA | 8 | -19800000 | high |
amc rebel sst | amc | 16 | 304 | 150 | 3433 | 12 | USA | 8 | -19800000 | high |
ford torino | ford | 17 | 302 | 140 | 3449 | 10.5 | USA | 8 | -19800000 | high |
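The role of the resolver function can be sketched in plain JavaScript. This is only an illustration, not the DataModel implementation: addCalculatedField is a hypothetical helper that works on plain objects and calls the resolver once per row, passing the listed field values in order.

```javascript
// Plain-JavaScript sketch of a calculated field; not the DataModel implementation.
// name: name of the new field
// fieldNames: existing fields whose values are passed to the resolver, in order
// resolver: function returning the new value for one row
function addCalculatedField(rows, name, fieldNames, resolver) {
  // Build new row objects so that the input rows are left untouched.
  return rows.map((row) => ({
    ...row,
    [name]: resolver(...fieldNames.map((field) => row[field])),
  }));
}

const carRows = [
  { Horsepower: 130, Weight_in_lbs: 3504 },
  { Horsepower: 165, Weight_in_lbs: 3693 },
];

const withRatio = addCalculatedField(
  carRows,
  "powerToWeight",
  ["Horsepower", "Weight_in_lbs"],
  (hp, weight) => hp / weight,
);
console.log(withRatio[0].powerToWeight === 130 / 3504); // true
console.log("powerToWeight" in carRows[0]); // false, the input is not mutated
```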
4. select
Syntax: select(conditions, config)
parameters:
- conditions:
required: true
type: Object
descriptions: An Object to govern the selection of values from the data
default: {}
parameters: SelectionParameters
- config:
required: false
type: Object
default: {}
description: The configuration object that controls whether a row is included in or excluded from the resultant DataModel instance.
parameters:
- mode:
required: true
type: FilteringModes
descriptions: The mode of the selection
Comparison Operations
parameters:
- field:
required: true
type: string
descriptions: The field name to compare
- operator:
required: true
type: ComparisonOperator
descriptions: The comparison operation to be done
- value:
required: true
type: SupportedDataTypes | array<SupportedDataTypes>
descriptions: The value to be compared with
Logical Operations
parameters:
- operator:
required: true
type: LogicalOperator
descriptions: The operation with which all the comparison operations are connected
- conditions:
required: true
type: array<ComparisonOperations>
descriptions: The list of comparison operations (nested logical operations are also accepted, as Example 2 shows)
SupportedDataTypes: string | number | null | undefined
FilteringModes operate on the selection and rejection sets to determine which one is reflected in the resultant DataModel. The filtering modes are:
- INVERSE
- NORMAL
- ALL
Note: The selection and rejection sets are only a logical idea, used to explain the concept.
Returns: DataModel (Returns an instance of DataModel with the data selected according to the conditions)
Example 1
const Datamodel = muze.DataModel;
const data = [
//... Cars Data as shown in above example ...
];
const schema = [
{
name: "Name",
type: "dimension",
},
{
name: "Origin",
type: "dimension",
},
{
name: "Cylinders",
type: "dimension",
},
];
const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const { EQUAL } = Datamodel.ComparisonOperators;
const { AND, OR } = Datamodel.LogicalOperators;
const selectedDM = dm.select({
field: "Origin",
value: "Japan",
operator: EQUAL,
});
Printing selectedDM gives:
Name | Origin | Cylinders |
---|---|---|
toyota corona mark ii | Japan | 4 |
datsun pl510 | Japan | 4 |
datsun pl510 | Japan | 4 |
toyota corona | Japan | 4 |
toyota corolla 1200 | Japan | 4 |
Example 2
const Datamodel = muze.DataModel;
const data = [
//... Cars Data as shown in above example ...
];
const schema = [
{
name: "Name",
type: "dimension",
},
{
name: "Origin",
type: "dimension",
},
{
name: "Cylinders",
type: "dimension",
},
];
const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const { EQUAL } = Datamodel.ComparisonOperators;
const { AND, OR } = Datamodel.LogicalOperators;
const selectedDM = dm.select({
conditions: [
{ field: "Origin", value: "Japan", operator: EQUAL },
{
conditions: [
{ field: "Cylinders", value: "3", operator: EQUAL },
{ field: "Cylinders", value: "6", operator: EQUAL },
{ field: "Cylinders", value: "8", operator: EQUAL },
],
operator: OR,
},
],
operator: AND,
});
Printing selectedDM gives:
Name | Origin | Cylinders |
---|---|---|
mazda rx2 coupe | Japan | 3 |
maxda rx3 | Japan | 3 |
toyota mark ii | Japan | 6 |
toyota mark ii | Japan | 6 |
datsun 810 | Japan | 6 |
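How nested conditions combine can be sketched with a small recursive evaluator in plain JavaScript. This is a sketch under assumptions, not the DataModel implementation: matches is a hypothetical helper, the operator constants are stand-ins for the values in Datamodel.ComparisonOperators and Datamodel.LogicalOperators, and only EQUAL, AND and OR are covered.

```javascript
// Plain-JavaScript sketch of condition evaluation; not the DataModel implementation.
// Hypothetical stand-ins for the library's operator constants.
const EQUAL = "equal";
const AND = "and";
const OR = "or";

function matches(row, condition) {
  // A logical operation carries a list of child conditions.
  if (Array.isArray(condition.conditions)) {
    const results = condition.conditions.map((child) => matches(row, child));
    if (condition.operator === AND) return results.every(Boolean);
    if (condition.operator === OR) return results.some(Boolean);
    throw new Error(`Unsupported logical operator: ${condition.operator}`);
  }
  // Otherwise it is a comparison operation on a single field.
  if (condition.operator === EQUAL) return row[condition.field] === condition.value;
  throw new Error(`Unsupported comparison operator: ${condition.operator}`);
}

const carRows = [
  { Name: "mazda rx2 coupe", Origin: "Japan", Cylinders: "3" },
  { Name: "toyota corona", Origin: "Japan", Cylinders: "4" },
  { Name: "toyota mark ii", Origin: "Japan", Cylinders: "6" },
  { Name: "chevrolet chevelle malibu", Origin: "USA", Cylinders: "8" },
];

// Same shape as the condition in Example 2.
const condition = {
  conditions: [
    { field: "Origin", value: "Japan", operator: EQUAL },
    {
      conditions: [
        { field: "Cylinders", value: "3", operator: EQUAL },
        { field: "Cylinders", value: "6", operator: EQUAL },
        { field: "Cylinders", value: "8", operator: EQUAL },
      ],
      operator: OR,
    },
  ],
  operator: AND,
};

// The selection set holds matching rows; the rejection set holds the rest.
const selection = carRows.filter((row) => matches(row, condition));
const rejection = carRows.filter((row) => !matches(row, condition));
console.log(selection.map((row) => row.Name)); // [ 'mazda rx2 coupe', 'toyota mark ii' ]
console.log(rejection.map((row) => row.Name)); // [ 'toyota corona', 'chevrolet chevelle malibu' ]
```

In terms of the filtering modes, NORMAL corresponds to the selection set and INVERSE to the rejection set in this sketch.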
5. project
Syntax: project(projField, config)
parameters:
- projField:
required: true
type: Array<(string|RegExp)>
descriptions: An array of column names, each given as a string or a regular expression.
- config:
required: false
type: Object
default: {}
description : The configuration object to control the creation of new DataModel.
parameters:
- mode:
required: true
type: FilteringModes
descriptions: Mode of the projection
This is the functional version of the projection operator. Projection is a column (field) filtering operation. It expects a list of field names and either includes or excludes those fields in the resultant DataModel, depending on the FilteringMode. It returns a function which is called with the DataModel instance on which the action needs to be performed.
Projection expects an array of field names, from which it creates the selection and rejection sets. Every field whose name is present in the array goes into the selection set; the remaining fields go into the rejection set.
FilteringModes operate on the selection and rejection sets to determine which one is reflected in the resultant DataModel.
Note: The selection and rejection sets are only a logical idea, used to explain the concept.
Returns: DataModel (Returns an instance of DataModel with the data projected according to the field names)
Example:
const Datamodel = muze.DataModel;
const data = [
//... Cars Data as shown in above example ...
];
const schema = [
{
name: "Name",
type: "dimension",
},
{
name: "Origin",
type: "dimension",
},
{
name: "Cylinders",
type: "dimension",
},
];
const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const outputDM = dm.project(["Name"], { mode: Datamodel.FilteringModes.INVERSE });
Printing outputDM gives:
Origin | Cylinders |
---|---|
USA | 8 |
USA | 8 |
USA | 8 |
USA | 8 |
USA | 8 |
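The selection and rejection sets can be made concrete with a plain JavaScript sketch. It is an illustration only, not the DataModel implementation: projectRows is a hypothetical helper that works on plain objects, the mode strings "normal" and "inverse" are stand-ins for Datamodel.FilteringModes, and the ALL mode is left out because its result is not described here.

```javascript
// Plain-JavaScript sketch of projection; not the DataModel implementation.
// projField: array of field names, each a string or a RegExp (without the g flag)
// mode: "normal" keeps the selection set, "inverse" keeps the rejection set
function projectRows(rows, projField, mode = "normal") {
  const allFields = rows.length > 0 ? Object.keys(rows[0]) : [];
  const isSelected = (name) =>
    projField.some((p) => (p instanceof RegExp ? p.test(name) : p === name));

  const selectionSet = allFields.filter(isSelected);
  const rejectionSet = allFields.filter((name) => !isSelected(name));
  const kept = mode === "inverse" ? rejectionSet : selectionSet;

  // Rebuild every row with only the kept fields.
  return rows.map((row) => Object.fromEntries(kept.map((f) => [f, row[f]])));
}

const carRows = [
  { Name: "chevrolet chevelle malibu", Origin: "USA", Cylinders: "8" },
  { Name: "buick skylark 320", Origin: "USA", Cylinders: "8" },
];

const withoutName = projectRows(carRows, ["Name"], "inverse");
console.log(withoutName[0]); // { Origin: 'USA', Cylinders: '8' }

const byPattern = projectRows(carRows, [/^Cyl/, "Name"]);
console.log(Object.keys(byPattern[0])); // [ 'Name', 'Cylinders' ]
```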
6. splitByRow
Syntax: splitByRow(fieldNames)
parameters:
- fieldNames:
required: true
type: Array<string>
descriptions: An array of column names, given as strings.
Returns: Array of DataModel (Returns an array of DataModel instances, split according to the field names)
Example 1
This method splits the data into groups, one for each unique combination of values of the given fields. For example, splitting the cars data used in this section by Origin yields an array of three DataModels, one each for USA, Japan and Europe:
const Datamodel = muze.DataModel;
const data = [
//... Cars Data as shown in above example ...
];
const schema = [
{
name: "Name",
type: "dimension",
},
{
name: "Acceleration",
type: "measure",
defAggFn: "avg",
},
{
name: "Origin",
type: "dimension",
},
{
name: "Cylinders",
type: "dimension",
},
];
const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const outputDm = dm.splitByRow(["Origin"]);
for (let i = 0; i < outputDm.length; i++) {
printDM(outputDm[i]);
}
Note: printDM is a utility function that renders a DataModel on a webpage, for demonstration purposes only.
The output gives 3 datamodels as shown below:
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
citroen ds-21 pallas | 17.5 | Europe | 4 |
volkswagen 1131 deluxe sedan | 20.5 | Europe | 4 |
peugeot 504 | 17.5 | Europe | 4 |
audi 100 ls | 14.5 | Europe | 4 |
saab 99e | 17.5 | Europe | 4 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
chevrolet chevelle malibu | 12 | USA | 8 |
buick skylark 320 | 11.5 | USA | 8 |
plymouth satellite | 11 | USA | 8 |
amc rebel sst | 12 | USA | 8 |
ford torino | 10.5 | USA | 8 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
toyota corona mark ii | 15 | Japan | 4 |
datsun pl510 | 14.5 | Japan | 4 |
datsun pl510 | 14.5 | Japan | 4 |
toyota corona | 14 | Japan | 4 |
toyota corolla 1200 | 19 | Japan | 4 |
Example 2
Similarly, the data can be split by unique combinations of the values of more than one field. In the following sample, the data is split by Origin and Cylinders, and nine DataModels are generated, because the data contains nine unique combinations of Origin and Cylinders values:
const Datamodel = muze.DataModel;
const data = [
//... Cars Data as shown in above example ...
];
const schema = [
{
name: "Name",
type: "dimension",
},
{
name: "Acceleration",
type: "measure",
defAggFn: "avg",
},
{
name: "Origin",
type: "dimension",
},
{
name: "Cylinders",
type: "dimension",
},
];
const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const outputDm = dm.splitByRow(["Origin", "Cylinders"]);
for (let i = 0; i < outputDm.length; i++) {
printTable(
outputDm[i].getData().data,
["Name", "Acceleration", "Origin", "Cylinders"],
{ rowLimit: 1 },
);
}
Note: printTable is a utility function that renders a DataModel on a webpage, for demonstration purposes only.
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
chevrolet chevelle malibu | 12 | USA | 8 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
audi 5000 | 15.9 | Europe | 5 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
mercedes-benz 280s | 16.7 | Europe | 6 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
toyota mark ii | 13.5 | Japan | 6 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
mazda rx2 coupe | 13.5 | Japan | 3 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
chevrolet vega 2300 | 15.5 | USA | 4 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
plymouth duster | 15.5 | USA | 6 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
toyota corona mark ii | 15 | Japan | 4 |
Name | Acceleration | Origin | Cylinders |
---|---|---|---|
citroen ds-21 pallas | 17.5 | Europe | 4 |
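The way unique combinations determine the number of resulting groups can be sketched in plain JavaScript. This is only a sketch, not the DataModel implementation: splitRows is a hypothetical helper that works on plain objects, and it returns groups in order of first appearance, which may differ from the order the library uses.

```javascript
// Plain-JavaScript sketch of splitting rows; not the DataModel implementation.
// fieldNames: fields whose unique value combinations define the groups
function splitRows(rows, fieldNames) {
  const groups = new Map();
  for (const row of rows) {
    // One key per unique combination of the listed field values.
    const key = JSON.stringify(fieldNames.map((field) => row[field]));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return [...groups.values()];
}

const carRows = [
  { Name: "chevrolet chevelle malibu", Origin: "USA", Cylinders: "8" },
  { Name: "buick skylark 320", Origin: "USA", Cylinders: "8" },
  { Name: "toyota corona mark ii", Origin: "Japan", Cylinders: "4" },
  { Name: "toyota mark ii", Origin: "Japan", Cylinders: "6" },
  { Name: "citroen ds-21 pallas", Origin: "Europe", Cylinders: "4" },
];

const byOrigin = splitRows(carRows, ["Origin"]);
console.log(byOrigin.length); // 3

const byOriginAndCylinders = splitRows(carRows, ["Origin", "Cylinders"]);
console.log(byOriginAndCylinders.length); // 4
```

Adding Cylinders to the split separates the two Japan rows, so the same five rows produce four groups instead of three.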